A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs
Authors
Abstract
We introduce a manually created, multi-reference dataset for abstractive sentence and short paragraph compression. First, we examine the impact of single- and multi-sentence-level editing operations on human compression quality as found in this corpus. We observe that substitution and rephrasing operations are more meaning-preserving than other operations, and that compressing in context improves quality. Second, we systematically explore the correlations between automatic evaluation metrics and human judgments of meaning preservation and grammaticality in the compression task, and analyze the impact of the linguistic units used and of precision versus recall measures on the quality of the metrics. Multi-reference evaluation metrics are shown to offer a significant advantage over single-reference metrics.
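The abstract does not specify which metrics were evaluated, so the following is only an illustrative sketch of the general idea of multi-reference scoring: a candidate compression is compared against each human reference by n-gram overlap, and the best-matching reference determines the score. The function names (`ngram_counts`, `overlap_prf`, `multi_reference_score`), whitespace tokenization, and the choice of taking the maximum F1 over references are all assumptions made for illustration, not the paper's method.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_prf(candidate, reference, n=1):
    """Return (precision, recall, F1) of clipped n-gram overlap.

    Assumption: lowercased whitespace tokens stand in for the
    'linguistic units' mentioned in the abstract.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    matched = sum((cand & ref).values())  # clipped counts of shared n-grams
    precision = matched / sum(cand.values()) if cand else 0.0
    recall = matched / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def multi_reference_score(candidate, references, n=1):
    """Score against several references, keeping the one with the best F1."""
    if not references:
        raise ValueError("at least one reference is required")
    return max((overlap_prf(candidate, r, n) for r in references),
               key=lambda prf: prf[2])


if __name__ == "__main__":
    refs = ["the cat sat on the mat", "the cat sat down"]
    print(multi_reference_score("the cat sat", refs))
```

With a single reference, a valid compression that happens to be phrased differently from that one reference is penalized; scoring against several references reduces this effect, which is one plausible reading of why multi-reference metrics correlate better with human judgments.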
Similar resources
A New Sentence Compression Dataset and Its Use in an Abstractive Generate-and-Rank Sentence Compressor
Sentence compression has attracted much interest in recent years, but most sentence compressors are extractive, i.e., they only delete words. There is a lack of appropriate datasets to train and evaluate abstractive sentence compressors, i.e., methods that apart from deleting words can also rephrase expressions. We present a new dataset that contains candidate extractive and abstractive compres...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Génération de résumés par abstraction complète
This Ph.D. thesis is the result of several years of research on automatic text summarization. Three major contributions are presented in the form of published and yet to be published papers. They follow a path that moves away from extractive summarization and toward abstractive summarization. The first article describes the HexTac experiment, which was conducted to evaluate the performance of h...
Using the Omega Index for Evaluating Abstractive Community Detection
Numerous NLP tasks rely on clustering or community detection algorithms. For many of these tasks, the solutions are disjoint, and the relevant evaluation metrics assume nonoverlapping clusters. In contrast, the relatively recent task of abstractive community detection (ACD) results in overlapping clusters of sentences. ACD is a sub-task of an abstractive summarization system and represents a tw...
Abstractive Compression of Captions with Attentive Recurrent Neural Networks
In this paper we introduce the task of abstractive caption or scene description compression. We describe a parallel dataset derived from the FLICKR30K and MSCOCO datasets. With this data we train an attention-based bidirectional LSTM recurrent neural network and compare the quality of its output to a Phrase-based Machine Translation (PBMT) model and a human-generated short description. An extens...